Molecular mechanism underlying the increased risk of colorectal cancer metastasis caused by single nucleotide polymorphisms in LI-cadherin gene

LI-cadherin is a member of the cadherin superfamily. LI-cadherin mediates Ca2+-dependent cell–cell adhesion through homodimerization. A previous study reported two single nucleotide polymorphisms (SNPs) in the LI-cadherin-coding gene (CDH17). These SNPs correspond to the amino acid changes of Lys115 to Glu and Glu739 to Ala. Patients with colorectal cancer carrying these SNPs are reported to have a higher risk of lymph node metastasis than patients without the SNPs. Although proteins associated with metastasis have been identified, the molecular mechanisms underlying the functions of these proteins remain unclear, making it difficult to develop effective strategies to prevent metastasis. In this study, we employed biochemical assays and molecular dynamics (MD) simulations to elucidate the molecular mechanisms by which the amino acid changes caused by the SNPs in the LI-cadherin-coding gene increase the risk of metastasis. Cell aggregation assays showed that the amino acid changes weakened the LI-cadherin-dependent cell–cell adhesion. In vitro assays demonstrated a decrease in homodimerization tendency and MD simulations suggested an alteration in the intramolecular hydrogen bond network by the mutation of Lys115. Taken together, our results indicate that the increased risk of lymph node metastasis is due to weakened cell–cell adhesion caused by the decrease in homodimerization tendency.


Results
Effect of amino acid changes on LI-cadherin-dependent cell-cell adhesion. First, we validated the effects of amino acid changes caused by the SNPs at the cellular level. Considering that cell-cell adhesion mediated by LI-cadherin is maintained by the homodimerization of LI-cadherin 9 , cell aggregation assays were employed to compare the cell-cell adhesion ability of the Chinese hamster ovary (CHO) cells expressing WT or mutant LI-cadherin 9,21,22 . Site-directed mutagenesis was performed to introduce a single or double mutation into the full-length plasmid of EC1-7 fused with monomeric GFP. Single mutations (K115E or E739A) and double mutations (K115E and E739A) were introduced. The established cells were termed K115E-CHO, E739A-CHO, and 2mut-CHO (Supplementary Figure S1 and Table S1). The cell-cell adhesion abilities of these cells, cells expressing LI-cadherin WT (WT-CHO), and non-transfected mock cells were compared by performing cell aggregation assays 9,21,22 and analyzing the size distribution of cell aggregates using micro-flow imaging (MFI).
All cell types expressing LI-cadherin generated cell aggregates. However, the size distribution of cell aggregates was different among the constructs (Fig. 1). Cells expressing LI-cadherin mutants generated a larger number of particles than WT-CHO between 25 and 40 µm. In contrast, the number of particles of 40 µm or greater was smaller than that of WT-CHO. The difference in size distribution from that of WT-CHO was more significant when E739A-CHO or 2mut-CHO was used. These results indicate that the mutations caused by the SNPs still allowed the formation of cell aggregates, but at the same time, the mutations inhibited the formation of larger cell aggregates. The Glu739 mutation had a more significant effect than the Lys115 mutation. www.nature.com/scientificreports/ We then performed in vitro and in silico assays to validate the mechanisms underlying decreased cell-cell adhesion ability. Although the mutation of Glu739 had a more significant effect than the mutation of Lys115, we were not able to perform further experimental assays or simulations to investigate the change in molecular characteristics caused by the mutation E739A because the expression level of the recombinant protein containing Glu739, was low, and the crystal structure of the domain containing Glu739, was not available. However, a previous study has suggested that the SNP corresponding to the amino acid change of Lys115 to Glu also has some impact on metastatic potency, depending on the statistical methods employed to analyze the data 18 . Therefore, in this study, we focused on biochemical assays and molecular dynamics (MD) simulations of constructs containing Lys115.
Comparison of homodimerization tendency. Homodimerization of the first four N-terminal domains, EC1-4, is the fundamental step in cell-cell adhesion mediated by LI-cadherin 9 . To validate whether the mutation affected the function of LI-cadherin, the homodimerization tendency of WT and K115E of EC1-4 was investigated. Sedimentation velocity-analytical ultracentrifugation (SV-AUC) was used to measure the dissociation constant (K D ) of the homodimer. The K D of EC1-4K115E homodimer was 51.6 µM (Fig. 2), whereas that of EC1-4WT homodimer was 39.8 µM 9 . This result indicates a slight decrease in the homodimerization tendency due to the mutation.
The first two N-terminal domains, EC1-2, also form a homodimer but cannot maintain LI-cadherin-dependent cell-cell adhesion 9 . Considering the possibility that the EC1-2 homodimer may be formed in a certain context in the human body, we also performed SV-AUC measurements of the K115E mutant of EC1-2. The K D value of the EC1-2K115E was 343 µM, whereas the K D value of the EC1-2WT was 75.0 µM 9 , showing that the mutation also decreases the homodimerization tendency of EC1-2.
Lys115 is located on the opposite side of the interface of the EC1-4 homodimer and does not directly contribute to the formation of the homodimer (Fig. 3). The architecture of the EC1-2 homodimer expressed in E. coli has been reported previously 23 . The authors pointed out that N-glycans bound to EC2 seemed to hamper the homodimer formation. As EC1-2 used in our experiments was expressed in mammalian cells and N-glycans should be conjugated to EC2 9 , the homodimer architecture formed during our experiment is likely different from and EC1-2K115E homodimer was 51.6 µM and 343 µM, respectively. The SV-AUC data of EC1-4WT and EC1-2WT are reported previously 9 . K D for EC1-2K115E was determined assuming that the s-value of EC1-2K115E is the same as that of EC1-2 homodimer 9 . Impact of Lys115 to Glu mutation on the structure of LI-cadherin. To explore how the mutation K115E decreased the homodimerization of EC1-4 and EC1-2 at the molecular level, the secondary structures of WT and K115E of EC1-4 and EC1-2 were investigated using circular dichroism (CD) spectroscopy. The CD spectra of all constructs exhibited a large negative peak around 216 nm, which is the typical spectrum of β-sheetrich proteins (Fig. 4A,B), and were consistent with the crystal structure of the EC1-4 homodimer, which showed that EC1, 2, 3, and 4 were β-sheet-rich 9 . There was no significant difference between the spectra of the WT and K115E.
The measurements of thermal stability using differential scanning calorimetry (DSC) also showed that there was no significant difference between T m and ΔH of WT and K115E (Fig. 4C,D and Table 1). Both CD and DSC results indicated that the mutation of Lys115 to Glu did not significantly alter the folding of EC1-4 and EC1-2. Therefore, to understand the cause of the decrease in homodimerization tendency caused by the mutation, we next employed MD simulations to investigate the difference between WT and K115E at the atomic level.  homodimer and compared them with the previous data of the simulations of the EC1-4WT homodimer (Supplementary Table S1) 9 . A mutant structure of Lys115 to Glu in the EC1-4 homodimer was generated using CHARMM-GUI 24,25 . Three independent simulations were conducted at different initial velocities. The convergence of the trajectories was confirmed by calculating the root mean square deviation (RMSD) values of the Cα atoms of each domain, as described in the Experimental Procedures section (Supplementary Figure S2).  www.nature.com/scientificreports/ In all simulations, the two chains did not fully dissociate throughout the simulation time. However, when focusing on the dynamics at the interface, clear differences were observed in one of the simulations for the mutant (K115Edimer-Run3). In K115Edimer-Run3, Cα-RMSD of chain B after superposing Cα of chain A indicated that chain B moved drastically with respect to chain A after approximately 100 ns of simulation time while preserving the homodimer structure (Fig. 5A,B). Visual inspection of K115Edimer-Run3 showed that part of the interaction observed at the interface of the crystal structure of the homodimer was abolished in the middle of the simulation, whereas the native interactions remained preserved in the simulations of WT and the other two simulations of K115E (Movies S1 and S2). The switching of the hydrophobic interaction partner at the homodimer interface was observed only in K115Edimer-Run3 (Supplementary Figures S3-S4). This dynamic character at the interface observed in K115Edimer-Run3 may reflect the result by cell aggregation assays that K115E-CHO was less likely to form large aggregates than WT-CHO, and the weaker binding affinity of the homodimer of the mutant than that of the WT, as experimentally measured by SV-AUC.
Causes of differences in the dynamics of the homodimer interface. To further quantify the differences between the WT and the mutant of the EC1-4 homodimer, the angle between EC1 and EC2 was calculated. The angle between the Cα atoms of Asp118, Asn120, and Asn122, which comprise the linker between EC1 and EC2, was defined (Fig. 5C). The EC1-4K115E homodimer exhibited a slightly larger angle than that of the EC1-4WT homodimer (Fig. 5D).
Next, we investigated the potential source of the change at the interface by analyzing the interaction network around the mutation site (115th residue). First, the distance between the center of gravity of the side chain of the residue at position 115 (Lys115 or Glu115) and Lys117 in chain A of the homodimer was analyzed; the distance was approximately 1.5 Å smaller in K115E (Fig. 6). As expected, the distance between OE1/OE2 of Glu115 (Glu115-OE1/OE2) and NZ of Lys117 (Lys117-NZ) in the mutant homodimer was smaller than that between the NZ of Lys115 (Lys115-NZ) and Lys117-NZ in the WT homodimer (Supplementary Figure S5). This is most likely due to the attractive interaction between negatively charged Glu115-OE1/OE2 and positively charged Lys117-NZ in the mutant and the repulsive interaction between the positively charged Lys115-NZ and Lys117-NZ in the WT. The distance between Glu115-OE1 and Lys117-NZ was ≤ 4 Å in around a quarter of the trajectories of the simulations of the EC1-4K115E homodimer, suggesting the formation of a salt bridge between the two charged residues (Supplementary Table S2). The formation of hydrogen bonds between the side chains of Glu115 and Lys117 was also observed in simulations of the mutant homodimer ( Table 2).
The hydrogen bond network between the 115th residue and residues in the adjacent β-strands was also investigated. The number of hydrogen bonds between the 115th residue and residues from the 35th to 39th or  www.nature.com/scientificreports/ from the 90th to 93rd positions was analyzed. The frequency of hydrogen bond formation between the 115th residue and residues from 35th to 39th positions increased in the mutant homodimer (Table 2). In contrast, the number of hydrogen bonds between the 115th residue and residues from the 90th to 93rd positions was significantly decreased by the K115E mutation (Table 2). These changes in the hydrogen bond network appear to have contributed to the difference in the angle between EC1 and EC2, and may explain the slight difference in T m and ΔH of WT and K115E observed in the DSC measurement. Although a clear alteration of the interface dynamics occurred only in K115Edimer-Run3, changes in the hydrogen bond network were observed in two other runs of K115E. These results suggest that the mutation of Lys115 could significantly alter the dynamics of the molecule by slightly adjusting the folding of the molecule.

Discussion
Metastasis is a major factor that increases cancer mortality. Understanding the mechanisms underlying cancer metastasis is essential for developing effective cancer treatment strategies. Although many types of proteins have been identified as key factors that govern cancer metastasis, only a few studies have revealed the mechanism underlying metastasis by focusing on the molecular characteristics of target proteins. Here, we have shown the possible roles of LI-cadherin in lymph node metastasis of colorectal cancer by analyzing how amino acid changes caused by SNPs in the LI-cadherin-coding gene affect cell aggregation ability and the conformation of the molecule.
Our data showed that the mutations K115E and E739A decreased cell aggregation. In vitro and in silico analyses have shown that the mutation K115E decreases homodimerization tendency, possibly due to changes in the hydrogen bond network around the 115th residue. Although, as compared to E739A, the effect of the K115E mutation on cell aggregation ability was smaller, the fact that the results of both in vitro (SV-AUC) and in silico (MD simulations) analyses were consistent with that of the cell aggregation assays implied that it was also important to focus on the mutation of K115E. Collectively, as a mechanism of how SNPs in the LI-cadherin gene increase the risk of colorectal cancer metastasis, we propose the mechanism shown in Fig. 7. As LI-cadherindependent cell-cell adhesion is achieved by homodimerization of LI-cadherin, cell-cell adhesion is weakened by a decrease in homodimerization tendency. The weakened cell-cell adhesion ability leads to a higher risk of cancer cell migration from the primary tumor, which leads to a higher risk of metastasis.
A previous study on the genomic analysis of colorectal cancer patients has shown that the G/G genotype or G allele of c.343 and C/C genotype or C allele of c.2216 increase the risk of lymph node metastasis. However, when another statistical method was employed, no difference in lymph node metastasis was observed between the G allele and A allele of c.343, whereas the difference was still observed between the C allele and A allele of c.2216 18 . Our cell aggregation assay data showed that the Glu739 mutation weakened cell aggregation ability more than the Lys115 mutation, suggesting that the impact of the Lys115 mutation is also smaller in the human body. We speculated that our in vitro and in silico assays were able to capture subtle effects induced by the mutation, which may reflect the results of a previous study showing that the results of the statistical assessment differed between the methods employed. We also speculated that smaller cell aggregates may have a greater chance to www.nature.com/scientificreports/ metastasize through thin lymph vessels than larger aggregates. While large cell aggregates would extend the boundaries of the tumor, smaller aggregates may be transported through lymph vessels and trapped at the lymph nodes, providing those aggregated cells with sufficient time to undergo the mesenchymal to epithelial transition necessary to initiate metastasis. The contribution of LI-cadherin to cancer progression depends on the tumor type. A study using the human colorectal adenocarcinoma cell line LoVo showed that knockdown of LI-cadherin increased the invasion and metastatic potency of LoVo cells 26 . In contrast, knockdown of the LI-cadherin gene in the mouse pancreatic ductal adenocarcinoma cell line Panc02-H7 suppressed the cell proliferation in vitro and orthotopic tumor growth in vivo 4 . This was also the case for E-cadherin expression. Whereas loss of cell-cell adhesion ability by the downregulation of E-cadherin by epithelial-mesenchymal transition is often observed in cancer cells 27,28 , it has also been reported that E-cadherin promotes metastasis in various invasive ductal carcinoma models by limiting apoptosis mediated by reactive oxygen species 29 . Our results suggest that in cancer cells in which LIcadherin suppresses metastasis, molecules that strengthen LI-cadherin-dependent cell-cell adhesion can be used to suppress the migration of cancer cells by maintaining LI-cadherin-dependent cell-cell adhesion.
To the best of our knowledge, this is the first report to describe the structural basis for the increased risk of cancer metastasis caused by amino acid changes in cell-cell adhesion molecules. Cancer metastasis is a complex process involving the invasion of primary tumor, its intravasation into blood vessels and lymph nodes, circulation within blood vessels and lymph nodes, extravasation, and settlement at the metastatic site 30 . In the lymph node metastasis of colorectal cancer focused on in this study, we believe that not only LI-cadherin-dependent cell-cell adhesion, but also various other factors, such as interaction between LI-cadherin and other cell surface proteins or extracellular matrix components and intracellular signal transduction, may contribute. Although further studies, including the use of in vivo models, are needed to fully describe the mechanisms by which SNPs on the LI-cadherin gene increase the risk of lymph node metastasis in colorectal cancer, our study highlighted the contribution of cell-cell adhesion molecules to the complex interaction network of cancer metastasis, suggesting that molecules targeting cell-cell adhesion proteins may have the potential to inhibit cancer metastasis.

Methods
Expression and purification of recombinant LI-cadherin. All LI-cadherin constructs were expressed as previously described 9 . Briefly, Expi293F cells (Thermo Fisher Scientific) were transfected with the pcDNA3.4 vector encoding LI-cadherin sequence and C-terminal Myc-tag, NSAVD sequence, and 6xHis-tag , following the manufacturer's protocol. Cells were cultured at 37 °C and 8.0% CO 2 for three days after transfection. The supernatant was collected by centrifuging the cell culture for 15 min at 1500 rpm and was dialyzed against a solution of 20 mM Tris-HCl (pH 8.0), 300 mM NaCl, and 3 mM CaCl 2 . Proteins were purified by immobilized metal affinity chromatography using Ni-NTA Agarose (Qiagen), followed by size exclusion chromatography using the HiLoad 26/600 Superdex 200 pg column (Cytiva) at 4 °C equilibrated in buffer A (10 mM HEPES-NaOH (pH 7.5), 150 mM NaCl, and 3 mM CaCl 2 ). Unless otherwise indicated, samples were dialyzed in buffer A before analysis, and the filtered dialysis buffer was used for assays.
Site-directed mutagenesis. Introduction of mutations in plasmids was performed as described previously 31 . The plasmid for EC1-2WT expression was prepared by introducing the mutation to the plasmid for EC1-2K115E expression using forward primer 5′-GTG AAG GAC ATC AAC GAC AAT CGA CCC-3′ and reverse primer 5′-CTT TAT GGT GAT AGG GAC TGG ACC CTCC-3′. The plasmids for EC1-4K115E expression and the establishment of K115E-CHO were prepared by introducing the mutation to the plasmids for EC1-4WT expression and the establishment of WT-CHO, respectively, using forward primer 5′-GAG GTG AAG GAC ATC AAC GAC AAC CGTC-3′ and reverse primer 5′-GAT GGT GAT GGG CAC GGG ACC CTC -3′. The plasmids for the establishment of E739A-CHO and 2mut-CHO were prepared by introducing the mutation to the plasmids for the establishment of WT-CHO and K115E-CHO, respectively, using forward primer 5′-CGT ACG TGG TGC TGA TCC GTA TCA AC-3′ and reverse primer 5′-CAC GCT CCT CGA ACT CGG TGTG-3′. For site-directed mutagenesis other than EC1-2K115E to EC1-2WT, Ligation high Ver. 2 (TOYOBO) was used instead of Ligation high (TOYOBO). Circular dichroism (CD) spectroscopy. CD spectra were measured using a J-820 spectrometer (JASCO) at 25 °C. Spectra from 260 to 200 nm were scanned five times at a scan rate of 20 nm/min. A quartz cuvette, with an optical length of 1 mm, was used. The measurements were performed at protein concentrations of 5 µM (EC1-4WT and EC1-4K115E) and 10 µM (EC1-2WT and EC1-2K115E). The spectrum of the buffer was subtracted from that of each protein sample. Data are represented as mean residue ellipticities by normalizing the subtracted spectra with protein concentration and the number of amino acid residues comprising the protein.
MD simulations. MD simulations of LI-cadherin were performed as previously described 9 . In brief, GROMACS 2016.3 32 with a CHARMM36m force field 33 was used. The crystal structure of the EC1-4WT homodimer (PDB: 7CYM) was used as the initial structure. The sugar chains were removed from the original crystal structure. Missing residues were modelled using MODELLER 9.18 34 . The addition of N-linked sugar chains (G0F) and the mutation of Lys115 to Glu115 was performed using CHARMM-GUI 24 www.nature.com/scientificreports/ were solvated with TIP3P water 35 in a rectangular box such that the minimum distance to the edge of the box was 15 Å under periodic boundary conditions using CHARMM-GUI 24 . Each system was energy-minimized for 5000 steps and equilibrated with the NVT ensemble (298 K) for 1 ns. Further simulations were performed using the NPT ensemble at 298 K. The time step was set as 2 fs throughout the simulation. A snapshot was recorded every 10 ps. All trajectories were analyzed using the GROMACS tools.

Sedimentation velocity analytical ultracentrifugation (SV-AUC).
The detailed protocol for SV-AUC has been described in a previous report 9 . Briefly, the Optima AUC (Beckman Coulter) equipped with an 8-hole An50 Ti rotor was used. Measurements were performed at 20 °C with 1, 2.5, 5, 10, 20, 40, 50, 80, and 120 µM of EC1-4K115E or 1, 2.5, 5, 10, 20, 40, and 60 µM of EC1-2K115E. A protein sample (390 µL) was loaded into the sample sector of a cell equipped with sapphire windows and 12 mm double-sector charcoal-filled on the centerpiece. Buffer (400 µL) was loaded into the reference sector of each cell. Data were collected at 42,000 rpm with a radial increment of 10 µm, using a UV detection system.

Establishment of CHO cells expressing LI-cadherin mutants.
CHO cells expressing LI-cadherin mutants were established as described previously 9 . The DNA sequence of the LI-cadherin constructs fused with monomeric GFP was cloned into the pcDNA5/FRT vector (Thermo Fisher Scientific). The Flp-In-CHO Cell Line (Thermo Fisher Scientific) was transfected with the plasmid, following the manufacturer's protocol. Cloning was performed using the limiting dilution-culture method. Cells were observed using an In Cell Analyzer 2000 instrument (Cytiva) with the FITC filter (490/20 excitation, 525/36 emission), and the cells expressing GFP were selected. The cells were cultivated in Ham's F-12 Nutrient Mixture (ThermoFisher Scientific) supplemented with 10% fetal bovine serum (FBS), 1% L-glutamine, 1% GlutaMAX-I (ThermoFisher Scientific), 1% penicillin-streptomycin, and 0.5 mg mL −1 hygromycin B at 37 °C and 5.0% CO 2 .
Cell imaging. After establishing the cell lines expressing LI-cadherin, the expression of LI-cadherin in cells was further confirmed by observing the fluorescence of GFP. The protocols for imaging the cells have been previously described 9 .
Cell aggregation assay. The cell aggregation assays were performed by modifying the methods described in previous reports 21,22 . The detailed protocol has been previously described 9 . Briefly, LI-cadherin expressing cells were detached from the plate using 0.01% trypsin and washed with 1 × HMF supplemented with 20% FBS. Cells were suspended in 1 × HMF and 5 × 10 4 cells were loaded into a 24-well plate coated with 1% w/v BSA. EDTA was added if necessary. After incubating the plate at room temperature for 5 min, 24-well plate was placed on a shaker and rotated at 80 rpm for 60 min at 37 °C.
Micro-flow Imaging (MFI). MFI (Brightwell Technologies) was used to count the particle number and visualize the cell aggregates after the cell aggregation assays as described previously 9 . After the cell aggregation assay procedure described above, the plate was incubated at room temperature for 10 min, followed by the addition of 500 µL of 4%-Paraformaldehyde Phosphate Buffer Solution (Nacalai Tesque) to each well. The plate was then incubated on ice for more than 20 min. The MFI View System Software was used for measurements and analyses. The instrument was flushed with detergent and ultrapure water before the experiments. The cleanliness of the flow channel was checked by performing measurements using ultrapure water and confirming the presence of less than 100 particles/mL. The flow path was washed with 1 × HMF before the samples were measured. The purge volume and analyzed volume were 200 and 420 µL, respectively. Optimize Illumination was performed prior to each measurement. The number of particles that were 1 µm or larger and less than 100 µm in size was counted.

Data availability
All data generated or analyzed during this study are included in this manuscript and its Supplementary Information files. www.nature.com/scientificreports/ Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.